Abstract

By focusing on what can be observed by running
traceroute-like measurements at a high frequency from
a single monitor to a fixed destination set, we show that the
observed view of the topology is constantly evolving at a
pace much higher than expected. Repeated measurements
discover new IP addresses at a constant rate, for long
periods of time (up to several months).

In order to provide explanations, we study this
phenomenon at both the IP and the Autonomous System (AS)
levels. We show that this renewal of IP addresses is partially
caused by BGP routing dynamics, which alter paths between
existing ASes. Furthermore, we conjecture that intra-AS
routing dynamics are another cause of this phenomenon.



1 Introduction 

Most works aimed at mapping the Internet IP-level
topology rely on traceroute-like probes, for instance
[11, 19]. These probes are repeated periodically for long
periods of time, each round of measurement leading to a
partial and biased view of the topology. It is indeed known
that, because of phenomena such as load balancing [24], it
is not possible to see everything that can be seen from a
monitor in a single round. One round discovers only one
path among several between the monitor and a destination.
Snapshots of the Internet topology are therefore constructed
by merging series of measurement rounds. This relies on the
assumption that it is possible to explore a given part of the
topology with a finite number of probes.

We focus here on what can be observed by running
traceroute-like probes at a high frequency from a single
monitor to a constant destination set [11]. We show that
the observed view of the topology is constantly evolving
at a rate much higher than expected. For instance, during
the last week of a two-month measurement, we discovered
1 118 new IP addresses (out of a total of 29 100) that had
never been observed before.

These observations imply in particular that it is never
possible to discover everything that can be seen from a
monitor; moreover, aggregating data from such measurements
leads to topology maps containing much obsolete information.

In this paper we describe and study this phenomenon.
Though we do not obtain a conclusive explanation, we show
that fast routing dynamics are one of its causes.

2 Data set 

We use the data described in [11]. Measurements were
conducted from more than 150 monitors. Each monitor had
a destination set that stayed the same for the whole duration
of the measurements. The measurements consisted in
periodically running the tracetree tool, which collects
a routing tree from a given monitor to a set of destinations
in a traceroute-like manner. The measurements were
conducted at a high frequency (typically about 100
measurement rounds per day), for a long period of time (from
weeks to several months, depending on the monitor). For
more details, see [11].

Our goal is not to study the data in detail or compare all
the data sets obtained from all monitors. On the contrary,
we stress that we observed similar phenomena
for each of them: while the exact details do of course
depend on the particular monitor under study, our observations
were qualitatively the same in all cases.

In this paper, we have therefore chosen to illustrate our
results by using a single monitor and the corresponding data
set. It consists of a single two-month measurement (June
and July 2007) from a monitor located in Japan, at a rate of
approximately 100 rounds per day, leading to 5 891 rounds
in total. The destination set consists of 3 000 IP addresses
chosen randomly among those that replied to ICMP echo
request messages at selection time.

Figure 1. Number of IP addresses observed
in each measurement round.

Figure 2. Number of IP addresses observed
since the beginning of the measurements as
a function of time.

Figure 3. Number of IP addresses that were
observed before time t and are never
observed after t as a function of t.

Figure 4. Number of distinct IP addresses
seen with stable destinations only.

3 IP addresses renewal 

In this section, we describe the evolution of the set of
observed IP addresses. Figure 1 presents the number of IP
addresses observed in each measurement round. All values in
this plot are centered around the same value (close to 12 000)
except for some downward peaks, which indicate rounds with
fewer IP addresses than usual. These peaks could indicate a
loss of connectivity at or near the monitor, or an event such
as a major routing change or failure. Studying this is
however out of the scope of this paper.

The next question is whether we observe the same IP
addresses in all rounds. This leads to the plot in Figure 2,
which presents the number of distinct IP addresses observed
since the beginning of the measurements as a function of
time. This plot gives evidence for a striking fact:
measurements continuously discover new IP addresses never seen
before.¹

Though it seems natural to observe some new IP
addresses after some measurement time, this happens here
at a surprisingly high rate: during the second month of
the measurements, around 150 new IP addresses are
discovered each day.

¹ Other, six-month-long measurements exhibit the same behavior.
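
The cumulative curve of Figure 2 and a per-day discovery rate can be computed along the following lines. This is only a minimal sketch, not the tooling used for the measurements: it assumes that each round is available as a set of address strings, and the function names `cumulative_distinct` and `new_per_day` are hypothetical.

```python
from typing import Iterable, List, Set


def cumulative_distinct(rounds: Iterable[Set[str]]) -> List[int]:
    """Number of distinct addresses seen up to and including each round."""
    seen: Set[str] = set()
    counts: List[int] = []
    for addresses in rounds:
        seen |= addresses
        counts.append(len(seen))
    return counts


def new_per_day(counts: List[int], rounds_per_day: int) -> List[int]:
    """New addresses discovered during each full day of measurement.

    The first day also counts the addresses seen in the very first round.
    """
    day_ends = counts[rounds_per_day - 1::rounds_per_day]
    return [b - a for a, b in zip([0] + day_ends, day_ends)]


# Toy example with 2 rounds per day (the real data has about 100).
rounds = [{"a", "b"}, {"b", "c"}, {"a", "c"}, {"c", "d"}]
counts = cumulative_distinct(rounds)
print(counts)                  # [2, 3, 3, 4]
print(new_per_day(counts, 2))  # [3, 1]
```

A curve that keeps growing at a steady slope, as in Figure 2, corresponds to roughly constant values in the output of `new_per_day` after the first days.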

We have seen (Figure 1) that the number of IP addresses
seen at each round is not increasing. The continuous
discovery of new IP addresses must therefore come together
with a continuous disappearance of addresses that we cease
to observe after some time. Figure 3 presents this. The
disappearances are indeed symmetric with the observation of
new IP addresses.²
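
The disappearance curve of Figure 3 can be sketched in the same spirit. The code below is an illustration under assumptions: rounds are sets of address strings, the name `disappeared_counts` is hypothetical, and an address is counted as gone at round t if its last observation is strictly before t (the exact convention used for the figure is not specified here).

```python
from typing import Dict, List, Set


def disappeared_counts(rounds: List[Set[str]]) -> List[int]:
    """For each round index t, count addresses last seen strictly before t."""
    last_seen: Dict[str, int] = {}
    for t, addresses in enumerate(rounds):
        for ip in addresses:
            last_seen[ip] = t
    # gone_at[t] = number of addresses whose last observation is round t - 1
    gone_at = [0] * (len(rounds) + 1)
    for t in last_seen.values():
        gone_at[t + 1] += 1
    counts: List[int] = []
    total = 0
    for t in range(len(rounds)):
        total += gone_at[t]
        counts.append(total)
    return counts


rounds = [{"a", "b"}, {"b", "c"}, {"c"}, {"c", "d"}]
print(disappeared_counts(rounds))  # [0, 1, 2, 2]
```

Note that the last observation of an address is only known in hindsight, so the values close to the end of the measurement period should be read with care.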

One possible cause for these observations would be that
some routers reply with random IP addresses; we will show
in the next section that this is not the case. Another
possible cause would be that some of our destinations are
dynamic addresses, i.e. dynamically allocated to different
hosts over time. Since such hosts could be in different
locations, depending on network operation, these dynamic
addresses could lead us to discover new paths and, as a result,
new addresses in the measurements.

Figure 4 shows that this is not the case. The idea is
to select the destinations that were stable during the
measurements. Using an approach similar to geolocation studies
(see for instance [14]), we considered that a destination
address is not dynamic if the address immediately before it
in the measurements is always the same; 35 out of 3 000
destinations satisfied this condition. We then simulated the
measurements by keeping only these stable addresses, and
we still clearly see a constant appearance of new IP
addresses: dynamic addresses are therefore not the cause of
this renewal. Note that our criterion for characterizing
stable addresses is very restrictive; we do not imply that the
addresses that do not satisfy it are dynamic. We tested other
criteria, which provided the same results.

² The plots of Figures 2 and 3 have a similar shape if one is
rotated by 180°.

Figure 5. Distribution of the number of
rounds in which each IP address was observed.

Figure 6. Number of IP addresses observed
in at least 2, 10, 50, 200 or 1000 different
rounds (top to bottom) since the beginning
of the measurements.
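
The stable-destination criterion used for Figure 4 can be illustrated as follows. This is a sketch under assumptions: each round is represented as a mapping from a destination to the list of hops towards it (destination last), rounds in which a destination was not reached are simply ignored, and the names `Round`, `stable_destinations` and `restrict` are hypothetical.

```python
from typing import Dict, List, Set

Round = Dict[str, List[str]]  # destination -> hops, destination last


def stable_destinations(rounds: List[Round]) -> Set[str]:
    """Destinations whose preceding hop is the same in every round."""
    previous: Dict[str, Set[str]] = {}
    for paths in rounds:
        for dest, hops in paths.items():
            # Only consider paths that actually reached the destination.
            if len(hops) >= 2 and hops[-1] == dest:
                previous.setdefault(dest, set()).add(hops[-2])
    return {dest for dest, prev in previous.items() if len(prev) == 1}


def restrict(rounds: List[Round], keep: Set[str]) -> List[Set[str]]:
    """Addresses seen in each round on paths to the kept destinations only."""
    return [
        {ip for dest, hops in paths.items() if dest in keep for ip in hops}
        for paths in rounds
    ]


rounds = [
    {"d1": ["m", "r1", "d1"], "d2": ["m", "r2", "d2"]},
    {"d1": ["m", "r3", "r1", "d1"], "d2": ["m", "r4", "d2"]},
]
stable = stable_destinations(rounds)
print(stable)  # {'d1'}
print(restrict(rounds, stable))
```

The output of `restrict` is a list of per-round address sets, to which the same cumulative count as for Figure 2 can then be applied.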

In summary, we observe a continuous, high-rate renewal
of the set of IP addresses observed from a monitor, and
show that it is not a measurement artifact, but an actual
property of the IP-level topology. This implies that
repeating measurements, even for long periods of time, cannot
converge to a full view of what can be observed.
Moreover, aggregating data obtained during consecutive rounds
to construct a topology map is not satisfactory, because this
means grouping up-to-date data together with obsolete data.

4 Recurring IP addresses 

In this section we ask whether we observe IP addresses
consistently, or only in a very small number of rounds.
For any number of rounds x, Figure 5
presents the number of IP addresses that were observed in
exactly x different rounds during the measurements.

Figure 7. Number of distinct ASes observed
since the beginning of the measurements:
real number (thin line), and estimated with a
simulation from Route Views data (thick line).
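
The distribution of the number of rounds in which each address is observed (Figure 5) can be obtained with a simple counting step. The sketch below assumes rounds given as sets of address strings; the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def rounds_per_address(rounds: List[Set[str]]) -> Counter:
    """Map each address to the number of rounds in which it was observed."""
    return Counter(ip for addresses in rounds for ip in addresses)


def occurrence_distribution(rounds: List[Set[str]]) -> Dict[int, int]:
    """Map x to the number of addresses observed in exactly x rounds."""
    return dict(Counter(rounds_per_address(rounds).values()))


rounds = [{"a", "b"}, {"a", "c"}, {"a", "b"}]
# "a" is seen in 3 rounds, "b" in 2 and "c" in 1.
print(sorted(occurrence_distribution(rounds).items()))  # [(1, 1), (2, 1), (3, 1)]
```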

This distribution shows that a large number of IP
addresses are very volatile: 3 030 IP addresses are indeed
observed only once during these two-month measurements.
On the other hand, a significant number of IP addresses
appear recurrently: they are seen in almost every round during
the measurements.

The presence of a large number of highly volatile IP
addresses naturally raises the question of whether these
addresses are the cause of the renewal of the observed IP
addresses. Figure 6 answers this question. It presents the
number of distinct IP addresses observed since the
beginning of the measurements, restricted to recurring addresses
that were observed in at least 2, 10, 50, 200 or 1 000
different rounds. Though the slopes of these plots are smaller than
the one in Figure 2, we continuously observe new recurring
addresses. As a matter of fact, if we only consider IP
addresses observed in at least 1 000 rounds (out of 5 891 rounds),
we still observe a non-negligible renewal. This shows that
the constant observation of new IP addresses is not caused
by volatile addresses only, and that recurring addresses are
also renewed. Moreover, this means that routers replying
with random addresses are not the cause of this renewal:
the corresponding addresses would only be observed a very
small number of times, and we showed that such addresses
are not the main cause of our observations.
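
The restriction to recurring addresses behind Figure 6 can be sketched as follows, assuming rounds given as sets of address strings. The name `cumulative_recurring` and its `min_rounds` parameter are hypothetical; recurrence is counted over the whole measurement period, i.e. in hindsight.

```python
from collections import Counter
from typing import List, Set


def cumulative_recurring(rounds: List[Set[str]], min_rounds: int) -> List[int]:
    """Cumulative count of distinct addresses seen in at least min_rounds rounds."""
    occurrences = Counter(ip for addresses in rounds for ip in addresses)
    recurring = {ip for ip, n in occurrences.items() if n >= min_rounds}
    seen: Set[str] = set()
    counts: List[int] = []
    for addresses in rounds:
        # Only addresses that pass the recurrence threshold are counted.
        seen |= addresses & recurring
        counts.append(len(seen))
    return counts


rounds = [{"a", "b"}, {"a", "c"}, {"c", "d"}, {"c", "e"}]
print(cumulative_recurring(rounds, 1))  # [2, 3, 4, 5]
print(cumulative_recurring(rounds, 2))  # [1, 2, 2, 2]
```

Calling the function with thresholds 2, 10, 50, 200 and 1 000 would give one curve per threshold, in the manner of Figure 6.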

5 Autonomous Systems 

We now study the same question on a different scale: do
we observe the same type of behavior when we consider
ASes rather than IP addresses? We associate each IP
address seen in the measurements to its AS using the Team
Cymru database [21]. Figure 7 (thin line) then presents the
number of distinct ASes observed during the measurements.

Figure 9. Number of IP addresses observed
in the largest AS as a function of time.

As expected, we see fewer ASes than IP addresses.
However, we observe the same behavior as before: each round
sees a more or less constant number of ASes (close to 950,
not shown here), and we continuously discover new ASes
during the measurements.

This can be considered as a partial explanation of what 
we observe at the IP level: if we discover new ASes, it 
is only natural that we should discover new IP addresses 
within them. 

To study this question further, we asked whether all the ASes
are equivalent in the measurements. Figure 8 shows the
distribution of the observed size of ASes: for each AS seen,
we computed how many different IP addresses we observed
within it, and plotted the corresponding distribution. As we
can see, this distribution is highly heterogeneous: for more
than 100 ASes (out of a total of 1 023), we observe only a single
IP address, while 3 ASes contain more than one thousand
observed IP addresses.
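
The observed-size distribution of ASes can be computed as sketched below. The IP-to-AS mapping is assumed to be already available as a dictionary (the lookup against the Team Cymru database is not shown), addresses without a mapping are skipped, and the name `as_size_distribution` is hypothetical. The addresses and AS numbers in the example come from documentation ranges.

```python
from collections import Counter
from typing import Dict, Iterable


def as_size_distribution(addresses: Iterable[str],
                         ip_to_as: Dict[str, int]) -> Dict[int, int]:
    """Map each observed size s to the number of ASes with s observed addresses."""
    # Observed size of an AS = number of distinct observed addresses mapped to it.
    sizes = Counter(ip_to_as[ip] for ip in set(addresses) if ip in ip_to_as)
    return dict(Counter(sizes.values()))


ip_to_as = {"192.0.2.1": 64500, "192.0.2.2": 64500, "198.51.100.1": 64501}
observed = ["192.0.2.1", "192.0.2.2", "198.51.100.1", "203.0.113.9", "192.0.2.1"]
print(sorted(as_size_distribution(observed, ip_to_as).items()))  # [(1, 1), (2, 1)]
```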

The presence of ASes with very large observed sizes in
the measurements naturally led to the question of whether
we also observe a renewal of observed IP addresses within
a single AS. Figure 9 shows the number of IP addresses
observed since the beginning of the measurements in the
largest observed AS.³ Again in this case we observe the
same type of behavior: we continuously observe new IP
addresses in this AS.

³ This is the Level3 Communications AS (number 3356), containing
1 333 IP addresses (out of a total of 29 100).

From these observations we can derive the following
conclusion: we observe a constant renewal, both at the AS
level, with the constant discovery of new ASes, and within
single ASes, with the discovery of new IP addresses in already
seen ASes. This allows us to break down the constant
appearance of new IP addresses into these two factors.

However, this does not explain why we should observe
new ASes, or new IP addresses within previously observed
ASes. In particular, the question that naturally arises about
the newly discovered ASes is whether they are new, i.e.
created after the beginning of the measurements,⁴ or if they are
pre-existing ASes that become visible to the measurements
due to BGP routing dynamics.

To study this question further, we used data from the 
Route Views project [1]. This project makes publicly
available a recording of BGP routing tables from several 
hosts. This data allowed us to simulate the measurements 
from an AS/BGP point of view: we chose a Route Views 
monitor located close to our monitor,⁵ then selected the
routing tables corresponding to the period of the
measurements. For each routing table, we extracted the ASes
belonging to BGP paths corresponding to IP prefixes of the
destinations. We thus obtained the set of all possible
observable ASes for each routing table.
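
The extraction of observable ASes from one routing table can be illustrated as follows. This is a sketch under assumptions, not the procedure actually used: a routing table is represented as a list of (prefix, AS path) pairs already parsed from the Route Views dumps, each destination is matched to its most specific covering prefix, and the names `Route` and `observable_ases` are hypothetical.

```python
import ipaddress
from typing import Iterable, List, Set, Tuple

Route = Tuple[str, List[int]]  # (prefix, AS path)


def observable_ases(table: Iterable[Route], destinations: Iterable[str]) -> Set[int]:
    """ASes on paths towards the most specific prefix covering each destination."""
    routes = [(ipaddress.ip_network(prefix), path) for prefix, path in table]
    result: Set[int] = set()
    for dest in destinations:
        address = ipaddress.ip_address(dest)
        covering = [(net, path) for net, path in routes if address in net]
        if not covering:
            continue  # no route towards this destination in the table
        longest = max(net.prefixlen for net, _ in covering)
        for net, path in covering:
            if net.prefixlen == longest:
                result.update(path)
    return result


table = [
    ("192.0.2.0/24", [64500, 64501]),
    ("192.0.2.0/24", [64502, 64501]),
    ("192.0.0.0/16", [64503]),
    ("198.51.100.0/24", [64504]),
]
print(sorted(observable_ases(table, ["192.0.2.7"])))  # [64500, 64501, 64502]
```

The linear scan over all routes is chosen for readability; with full BGP tables a prefix tree would be a more reasonable data structure.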

Figure 7 (thick line) then presents the number of distinct
observable ASes since the beginning of the measurements,
obtained through our simulations. We obtain a similar slope
with both methods, which confirms their validity. Note that
the numbers obtained with the Route Views data are larger
than the ones obtained from Team Cymru. This is due to
the fact that the Route Views data represents, at each
moment, the set of all possible AS paths leading to the
destinations; in contrast, the Team Cymru data is directly
extracted from the measurements, and therefore provides only
a single path to each destination.

The use of the Route Views data moreover allows us to
go further. We observed 1 072 ASes in total, 72 of which
were discovered after the beginning of the measurements.
Out of these 72 ASes, we found that 70, i.e. all but two
of them, were present in the first routing table (but did not
belong to AS paths leading to the destinations). This means
that these 70 ASes already existed at the beginning
of the measurements, but became visible because of BGP
routing changes.
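
The distinction between pre-existing and newly created ASes can be expressed with a few set operations. The sketch below assumes that the observable AS set of each routing table is already computed, and that the set of all ASes appearing anywhere in the first routing table is known; the name `classify_new_ases` is hypothetical.

```python
from typing import List, Set, Tuple


def classify_new_ases(observable_per_table: List[Set[int]],
                      all_ases_first_table: Set[int]) -> Tuple[Set[int], Set[int]]:
    """Split ASes discovered after the first table into (pre-existing, other)."""
    first = observable_per_table[0]
    # ASes that become observable only in a later routing table.
    later = set().union(*observable_per_table[1:]) - first
    preexisting = later & all_ases_first_table
    return preexisting, later - preexisting


observable = [{1, 2}, {1, 2, 3}, {2, 4}]
first_table_ases = {1, 2, 3, 9}
print(classify_new_ases(observable, first_table_ases))  # ({3}, {4})
```

In the toy example, AS 3 was present in the first table but not on a path to the destinations, so it counts as pre-existing, while AS 4 does not appear in the first table at all.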

Finally, we are able to conclude that, at the AS level, our
observations are caused by BGP routing dynamics,
which cause pre-existing ASes to become visible on the paths
between the monitor and the destinations.

⁴ During the year 2007, around 250 ASes were created every month, see
http://www.cidr-report.org/as2.0/

⁵ This is host route-views.wide.routeviews.org, located at one peering
point of the AS where our monitor is located.

6 Related work 

Much work has focused on the measurement bias created
by mapping the Internet topology with traceroute-like
probes. The majority of these works concern the fact that
running probes from a limited number of monitors misses
some links and/or creates a bias on the observed degrees
of the nodes, see for instance [19, 1, 3, 20, 5]. Others have
studied the fact that tools such as traceroute may report 
incomplete and/or false information, see for instance [24].



It is an acknowledged problem in the field that the
Internet topology evolves with time and that this may create
a bias in the measurements. However, though some works 
have studied the dynamics of the topology, at the IP or AS 
level (see for instance Il3l[l8l|25l[12][l3l[l7l[l6l|4l|2l[l), 
to our knowledge only one paper has attempted to study
the bias caused by these dynamics on the measurements [13].
The authors of this paper study the AS-level topology, and
design methods for evaluating with a certain degree of
confidence whether an observed topology change is a real change
or not. Though their approach and some of their
observations are similar to ours, they study the AS-level
topology whereas we study the IP-level topology, and consider
time scales much longer, and hence a much coarser time
resolution, than we do. The phenomena playing a role in
their observations are therefore different from those in our case:
they decompose their observations into a birth/death
process, coupled with transient routing dynamics.

Finally, another work [10] studied the measurement
process of different complex networks, including Internet
maps. They observed that, for the skitter data [2],
measurements continuously discover new IP addresses. This is
similar to our observations, though other causes probably play
a role in this: the skitter data is collected from several
monitors, at a lower frequency and for larger time scales than the
data we study here.

7 Conclusion and perspectives 

In this paper, we bring to light a surprising phenomenon: 
when performing periodic traceroute-like measurements
from a single monitor to a fixed set of destinations,
the obtained view of the topology never stabilizes. On the 
contrary, we continuously observe new IP addresses at a rate 
much higher than expected. This phenomenon is observed 
with various monitors and destination sets, and seems to be 
universal. 

We described this phenomenon in detail and attempted
to determine its cause. We first ruled out some possible
explanations: dynamic IP addresses among the destinations,
and routers answering with many random addresses. This
showed that this phenomenon is not a measurement artifact,
but an actual property of the IP-level topology.

Figure 10. Complementary cumulative distribution
of the time elapsed between the first
and last discovery of IP addresses over all
monitors.

We were able to break down the observation of new IP 
addresses into two factors: a constant observation of new 
ASes, coupled with an observation of new IP addresses 
within already discovered ASes. 

From the record of routing tables by the Route Views
project, we concluded that the discovery of new ASes is
caused by the dynamics of BGP routing. These ASes
were in fact created before the beginning of the
measurements, and became visible as a consequence of a change in
routing paths.

Following these conclusions on AS renewal, we
conjecture that the same cause holds for the IP address renewal,
and that newly discovered addresses were already allocated
at the beginning of the measurements, and became visible
because of routing changes.

Some preliminary results on this question support this
hypothesis. We combined measurements performed from
several monitors in order to test whether all monitors discover the
new IP addresses at the same time. We chose 11 monitors
that used the same destination set, and studied addresses
observed by two monitors or more. For each such address,
we recorded the time at which it was discovered by each
monitor individually and then computed the interval between the
first and the last of these discoveries. For instance, an
address seen by two monitors, first observed by the first
monitor at 7 AM and then discovered by the second
monitor at 11 AM the same day, gives an interval of four
hours.
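
The interval computation described above can be sketched as follows. The input layout (one mapping from address to discovery time per monitor), the name `discovery_intervals`, and the dates in the example are assumptions made for illustration only.

```python
from datetime import datetime
from typing import Dict, List


def discovery_intervals(discoveries: Dict[str, Dict[str, datetime]]) -> Dict[str, float]:
    """Hours between first and last discovery, for addresses seen by >= 2 monitors."""
    times: Dict[str, List[datetime]] = {}
    for per_monitor in discoveries.values():
        for ip, when in per_monitor.items():
            times.setdefault(ip, []).append(when)
    return {
        ip: (max(ts) - min(ts)).total_seconds() / 3600
        for ip, ts in times.items()
        if len(ts) >= 2
    }


# The example of the text: discovered at 7 AM by one monitor, 11 AM by another.
discoveries = {
    "monitor1": {"192.0.2.1": datetime(2007, 6, 1, 7),
                 "192.0.2.2": datetime(2007, 6, 1, 8)},
    "monitor2": {"192.0.2.1": datetime(2007, 6, 1, 11)},
}
print(discovery_intervals(discoveries))  # {'192.0.2.1': 4.0}
```

The address seen by a single monitor is left out of the result, in line with the restriction to addresses observed by two monitors or more.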



Figure 10 presents the complementary cumulative
distribution of these interval sizes. We observe that a large
number of IP addresses discovered by a given monitor were in
fact observed a significant time before by other
monitors. Among the 32 228 (out of 40 076) IP addresses seen
by at least two monitors, 22 897 were observed by one
monitor more than 200 hours before they were discovered
by another one, which means that these addresses existed
for a long time before they were discovered. Note that this
does not tell us whether other addresses existed before their
discovery or were created at this time. This indicates that
a large number of the IP addresses discovered by a given
monitor had in fact existed for a significant time before their
discovery, and that routing dynamics between existing
addresses play a strong role in our observations.
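
A complementary cumulative distribution such as the one of Figure 10 can be computed from a list of interval sizes as sketched below. The name `ccdf` is hypothetical, and the convention chosen here (fraction of values strictly greater than x) is an assumption.

```python
from typing import List, Tuple


def ccdf(values: List[float]) -> List[Tuple[float, float]]:
    """Pairs (x, fraction of values strictly greater than x) for each distinct x."""
    ordered = sorted(values)
    n = len(ordered)
    points: List[Tuple[float, float]] = []
    for i, x in enumerate(ordered):
        # Emit one point per distinct value, at its last occurrence.
        if i + 1 == n or ordered[i + 1] != x:
            points.append((x, (n - i - 1) / n))
    return points


print(ccdf([4, 4, 250, 300]))  # [(4, 0.5), (250, 0.25), (300, 0.0)]
```

With the figures reported above, 22 897 out of 32 228 addresses have an interval larger than 200 hours, so the distribution takes a value of about 0.71 at x = 200 hours.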

This work should be pursued in several directions. First,
we want to fully characterize and understand the renewal of
observed IP addresses. We think that it is possible to
perform new measurements, specifically designed for
answering the question of whether newly discovered IP addresses
existed prior to their discovery or not. Another direction
would be to perform simulations, which would open the
way to an accurate modeling of the phenomena causing the
renewal of IP addresses.

Second, our work indicates that there is no perfect
solution for constructing maps of the Internet topology: while a
single measurement round does not discover everything that
can be seen from a monitor (because of load balancing for
instance), aggregating data from several consecutive rounds
puts together obsolete and up-to-date data. It would be of
prime interest to determine whether a best compromise exists,
making it possible to construct the most accurate maps, for instance
by tuning the number of measurement rounds and their
frequency.